Goto

Collaborating Authors

 Fulton County


'Minecraft' movie mayhem raises alarms for America's youth, 'bad for society': expert

FOX News

"A Minecraft Movie," the big-screen adaptation of the popular video game "Minecraft," has been packing theaters with rowdy kids and teens since its release this month, spurring a social media phenomenon and sparking concern for America's youth. Videos on social media show young theatergoers huge reactions to one key scene, where one of the film's stars, Jack Black, yells out the phrase "Chicken Jockey!" as a small, Frankenstein-looking creature lands on top of a chicken in a boxing ring to face off with co-star Jason Momoa. The scene has prompted excited fans to scream, shout, throw popcorn around, jump up out of their seats, and in one instance in Provo, Utah, toss a live chicken in the air during a screening, according to the Salt Lake Tribune. Springs Cinema & Taphouse in Sandy Springs, Georgia, told FOX 5 Atlanta that its staff has had to clean up popcorn, ICEEs, ketchup and shattered glass. The scene featuring the "Chicken Jockey" in "A Minecraft Movie" has spawned some chaotic movie theater behavior from young audiences. "The movie-going experience has changed a lot since I was younger," Josh Gunderson, director of marketing and events at Oviedo Mall in Florida, told FOX Business.


Tempo: Helping Data Scientists and Domain Experts Collaboratively Specify Predictive Modeling Tasks

arXiv.org Artificial Intelligence

Temporal predictive models have the potential to improve decisions in health care, public services, and other domains, yet they often fail to effectively support decision-makers. Prior literature shows that many misalignments between model behavior and decision-makers' expectations stem from issues of model specification, namely how, when, and for whom predictions are made. However, model specifications for predictive tasks are highly technical and difficult for non-data-scientist stakeholders to interpret and critique. To address this challenge we developed Tempo, an interactive system that helps data scientists and domain experts collaboratively iterate on model specifications. Using Tempo's simple yet precise temporal query language, data scientists can quickly prototype specifications with greater transparency about pre-processing choices. Moreover, domain experts can assess performance within data subgroups to validate that models behave as expected. Through three case studies, we demonstrate how Tempo helps multidisciplinary teams quickly prune infeasible specifications and identify more promising directions to explore.


ToolQA: A Dataset for LLM Question Answering with External Tools Yuchen Zhuang, Haotian Sun, Chao Zhang College of Computing, Georgia Institute of Technology, Atlanta GA

Neural Information Processing Systems

Large Language Models (LLMs) have demonstrated impressive performance in various NLP tasks, but they still suffer from challenges such as hallucination and weak numerical reasoning. To overcome these challenges, external tools can be used to enhance LLMs' question-answering abilities. However, current evaluation methods do not distinguish between questions that can be answered using LLMs' internal knowledge and those that require external information through tool use. To address this issue, we introduce a new dataset called ToolQA, which is designed to faithfully evaluate LLMs' ability to use external tools for question answering. Our development of ToolQA involved a scalable, automated process for dataset curation, along with 13 specialized tools designed for interaction with external knowledge in order to answer questions. Importantly, we strive to minimize the overlap between our benchmark data and LLMs' pre-training data, enabling a more precise evaluation of LLMs' tool-use reasoning abilities. We conducted an in-depth diagnosis of existing tool-use LLMs to highlight their strengths, weaknesses, and potential improvements. Our findings set a new benchmark for evaluating LLMs and suggest new directions for future advancements.


Use of Winsome Robots for Understanding Human Feedback (UWU)

arXiv.org Artificial Intelligence

As social robots become more common, many have adopted cute aesthetics aiming to enhance user comfort and acceptance. However, the effect of this aesthetic choice on human feedback in reinforcement learning scenarios remains unclear. Previous research has shown that humans tend to give more positive than negative feedback, which can cause failure to reach optimal robot behavior. We hypothesize that this positive bias may be exacerbated by the robot's level of perceived cuteness. To investigate, we conducted a user study where participants critique a robot's trajectories while it performs a task. We then analyzed the impact of the robot's aesthetic cuteness on the type of participant feedback. Our results suggest that there is a shift in the ratio of positive to negative feedback when perceived cuteness changes. In light of this, we experiment with a stochastic version of TAMER which adapts based on the user's level of positive feedback bias to mitigate these effects.


CAPE: Covariate-Adjusted Pre-Training for Epidemic Time Series Forecasting

arXiv.org Artificial Intelligence

Accurate forecasting of epidemic infection trajectories is crucial for safeguarding public health. However, limited data availability during emerging outbreaks and the complex interaction between environmental factors and disease dynamics present significant challenges for effective forecasting. In response, we introduce CAPE, a novel epidemic pre-training framework designed to harness extensive disease datasets from diverse regions and integrate environmental factors directly into the modeling process for more informed decision-making on downstream diseases. Based on a covariate adjustment framework, CAPE utilizes pre-training combined with hierarchical environment contrasting to identify universal patterns across diseases while estimating latent environmental influences. We have compiled a diverse collection of epidemic time series datasets and validated the effectiveness of CAPE under various evaluation scenarios, including full-shot, few-shot, zero-shot, cross-location, and cross-disease settings, where it outperforms the leading baseline by an average of 9.9% in full-shot and 14.3% in zero-shot settings. The code will be released upon acceptance.


Multi-Aggregator Time-Warping Heterogeneous Graph Neural Network for Personalized Micro-Video Recommendation

arXiv.org Artificial Intelligence

Micro-video recommendation is attracting global attention and becoming a popular daily service for people of all ages. Recently, Graph Neural Networks-based micro-video recommendation has displayed performance improvement for many kinds of recommendation tasks. However, the existing works fail to fully consider the characteristics of micro-videos, such as the high timeliness of news nature micro-video recommendation and sequential interactions of frequently changed interests. In this paper, a novel Multi-aggregator Time-warping Heterogeneous Graph Neural Network (MTHGNN) is proposed for personalized news nature micro-video recommendation based on sequential sessions, where characteristics of micro-videos are comprehensively studied, users' preference is mined via multi-aggregator, the temporal and dynamic changes of users' preference are captured, and timeliness is considered. Through the comparison with the state-of-the-arts, the experimental results validate the superiority of our MTHGNN model.


The ELEVATE-AI LLMs Framework: An Evaluation Framework for Use of Large Language Models in HEOR: an ISPOR Working Group Report

arXiv.org Artificial Intelligence

Introduction. Generative Artificial Intelligence, particularly large language models (LLMs), offers transformative potential for Health Economics and Outcomes Research (HEOR). However, evaluating the quality, transparency, and rigor of LLM-assisted research lacks standardized guidance. This article introduces the ELEVATE AI LLMs framework and checklist, designed to support researchers and reviewers in assessing LLM use in HEOR. Methods. The ELEVATE AI LLMs framework was developed through a targeted review of existing guidelines and evaluation frameworks. The framework comprises ten evaluation domains, including model characteristics, accuracy, comprehensiveness, and fairness. The accompanying checklist operationalizes the framework. To validate the framework, we applied it to two published studies, demonstrating its usability across different HEOR tasks. Results. The ELEVATE AI LLMs framework provides a comprehensive structure for evaluating LLM-assisted research, while the checklist facilitates practical application. Validation of the framework and checklist on studies of systematic literature reviews and health economic modeling highlighted their ability to identify strengths and gaps in reporting. Limitations. While the ELEVATE AI LLMs framework provides robust guidance, its broader generalizability and applicability to diverse HEOR tasks require further empirical testing. Additionally, several metrics adapted from computer science need further validation in HEOR contexts. Conclusion. The ELEVATE AI LLMs framework and checklist fill a critical gap in HEOR by offering structured guidance for evaluating LLM-assisted research. By promoting transparency, accuracy, and reproducibility, they aim to standardize and improve the integration of LLMs into HEOR, ensuring their outputs meet the field's rigorous standards.


Thirteen proteins in your blood could reveal the age of your brain

New Scientist

Researchers trained an artificial intelligence model to gauge people's ages from their brain scans The abundance of 13 proteins in your blood seems to be a strong indicator of how rapidly your brain is ageing. This suggests that blood tests could one day help people track and even boost their brain health. Most previous studies that have looked at protein markers of brain ageing in the blood have involved fewer than 1000 people, says Nicholas Seyfried at Emory University in Atlanta, Georgia, who wasn't involved in the new research. To get a broader idea of the impact of these proteins, Wei-Shi Liu at Fudan University in China and his colleagues analysed MRI brain scan data from nearly 11,000 adults from the UK Biobank project, whose ages ranged from around 50 to 80 at the time of imaging. Using data from 70 per cent of the participants, Liu's team trained an artificial intelligence model to predict how old the participants were based on features of the brain images, such as the size of different brain regions and how distinct parts connected to each other.


EARN Fairness: Explaining, Asking, Reviewing and Negotiating Artificial Intelligence Fairness Metrics Among Stakeholders

arXiv.org Artificial Intelligence

Numerous fairness metrics have been proposed and employed by artificial intelligence (AI) experts to quantitatively measure bias and define fairness in AI models. Recognizing the need to accommodate stakeholders' diverse fairness understandings, efforts are underway to solicit their input. However, conveying AI fairness metrics to stakeholders without AI expertise, capturing their personal preferences, and seeking a collective consensus remain challenging and underexplored. To bridge this gap, we propose a new framework, EARN Fairness, which facilitates collective metric decisions among stakeholders without requiring AI expertise. The framework features an adaptable interactive system and a stakeholder-centered EARN Fairness process to Explain fairness metrics, Ask stakeholders' personal metric preferences, Review metrics collectively, and Negotiate a consensus on metric selection. To gather empirical results, we applied the framework to a credit rating scenario and conducted a user study involving 18 decision subjects without AI knowledge. We identify their personal metric preferences and their acceptable level of unfairness in individual sessions. Subsequently, we uncovered how they reached metric consensus in team sessions. Our work shows that the EARN Fairness framework enables stakeholders to express personal preferences and reach consensus, providing practical guidance for implementing human-centered AI fairness in high-risk contexts. Through this approach, we aim to harmonize fairness expectations of diverse stakeholders, fostering more equitable and inclusive AI fairness.


Introducing the Biomechanics-Function Relationship in Glaucoma: Improved Visual Field Loss Predictions from intraocular pressure-induced Neural Tissue Strains

arXiv.org Artificial Intelligence

Objective. (1) To assess whether neural tissue structure and biomechanics could predict functional loss in glaucoma; (2) To evaluate the importance of biomechanics in making such predictions. Design, Setting and Participants. We recruited 238 glaucoma subjects. For one eye of each subject, we imaged the optic nerve head (ONH) using spectral-domain OCT under the following conditions: (1) primary gaze and (2) primary gaze with acute IOP elevation. Main Outcomes: We utilized automatic segmentation of optic nerve head (ONH) tissues and digital volume correlation (DVC) analysis to compute intraocular pressure (IOP)-induced neural tissue strains. A robust geometric deep learning approach, known as Point-Net, was employed to predict the full Humphrey 24-2 pattern standard deviation (PSD) maps from ONH structural and biomechanical information. For each point in each PSD map, we predicted whether it exhibited no defect or a PSD value of less than 5%. Predictive performance was evaluated using 5-fold cross-validation and the F1-score. We compared the model's performance with and without the inclusion of IOP-induced strains to assess the impact of biomechanics on prediction accuracy. Results: Integrating biomechanical (IOP-induced neural tissue strains) and structural (tissue morphology and neural tissues thickness) information yielded a significantly better predictive model (F1-score: 0.76+-0.02) across validation subjects, as opposed to relying only on structural information, which resulted in a significantly lower F1-score of 0.71+-0.02 (p < 0.05). Conclusion: Our study has shown that the integration of biomechanical data can significantly improve the accuracy of visual field loss predictions. This highlights the importance of the biomechanics-function relationship in glaucoma, and suggests that biomechanics may serve as a crucial indicator for the development and progression of glaucoma.